# load package(s)
library(tidyverse)
library(ggrepel)
library(lubridate)
# load datasets
load("data/corruption.rda")
load("data/tech_stocks.rda")
# Read in the cdc dataset
cdc <- read_delim(file = "data/cdc.txt", delim = "|") |>
  mutate(
    genhlth = factor(
      genhlth,
      levels = c("excellent", "very good", "good", "fair", "poor")
    )
  )
# set seed
set.seed(86420)
# selecting a random subset of size 100
cdc_small <- cdc |> slice_sample(n = 100)
# generate toy dataset for exercise 3
toy_data <- tibble(
  theta = seq(0, 2 * pi, length.out = 100),
  obs = rnorm(100, sin(theta), 0.1),
  larger_than = if_else(abs(obs) > abs(sin(theta)), "1", "0")
)

L06 Scales, Axes, & Legends
Data Visualization (STAT 302)
Overview
The goal of this lab is to explore ways to manage and manipulate scales, axes, and legends within ggplot2.
Datasets
We’ll be using tech_stocks.rda, corruption.rda, cdc.txt, and a toy dataset generated in the setup code.
Exercise 1
Using the tech_stocks dataset, recreate the following graphic as precisely as possible.
Hints:
- key_glyph
- scales package will be useful
- legend linewidth is 1.3
- legend useful values: 0.75 and 0.85
- Eliminated extra space in horizontal direction
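The hints above can be sketched on stand-in data. This is only an illustration, not the reference solution: the column names date, company, and price_indexed are hypothetical, the dollar labels are a guess at how the scales package is meant to be used, and reading 0.75 and 0.85 as the legend's position inside the panel is an assumption. The theme call assumes ggplot2 3.5.0 or later.

```r
library(ggplot2)

# Stand-in data: the real tech_stocks columns are not shown in this lab,
# so `date`, `company`, and `price_indexed` are hypothetical names.
set.seed(1)
stocks_demo <- data.frame(
  date = rep(seq(as.Date("2020-01-01"), by = "month", length.out = 24), times = 2),
  company = rep(c("A", "B"), each = 24),
  price_indexed = c(100 + cumsum(rnorm(24)), 100 + cumsum(rnorm(24)))
)

p_stocks <- ggplot(stocks_demo, aes(x = date, y = price_indexed, color = company)) +
  # key_glyph swaps the default legend key for a small time-series glyph
  geom_line(key_glyph = "timeseries") +
  # expand = c(0, 0) removes the padding at both ends of the x-axis
  scale_x_date(name = NULL, expand = c(0, 0)) +
  # scales supplies label helpers; dollar labels are an assumption here
  scale_y_continuous(name = NULL, labels = scales::label_dollar()) +
  # thicken only the lines drawn in the legend keys
  guides(color = guide_legend(override.aes = list(linewidth = 1.3))) +
  theme_minimal() +
  # assumed meaning of 0.75 and 0.85: legend position in panel coordinates
  theme(
    legend.position = "inside",
    legend.position.inside = c(0.75, 0.85)
  )
```

Printing p_stocks draws the plot. With the real data, only the data frame and the aesthetic mappings should need to change.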
Exercise 2
Using the corruption.rda dataset, recreate the following graphic as precisely as possible.
Hints:
- Only use 2015 data
- Transparency is 0.6
- Formula is "y ~ log(x)"; method is "lm"; and color is grey40
- Point size is 3 in legend
- Color palette is "Set1"
- Package ggrepel
  - box.padding is 0.6
  - Minimum segment length is 0
  - seed is 9876
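As a rough sketch of how these hints fit together, the block below uses simulated data in place of the corruption dataset. The column names cpi, hdi, region, and country are hypothetical, and the five labeled rows are chosen arbitrarily. Only the argument values come from the hints.

```r
library(ggplot2)
library(ggrepel)

# Stand-in data: the columns of the corruption dataset are not shown here,
# so `cpi`, `hdi`, `region`, and `country` are hypothetical names.
set.seed(2)
corr_demo <- data.frame(
  cpi = runif(40, 10, 90),
  region = sample(c("Americas", "Asia", "Europe"), 40, replace = TRUE),
  country = paste0("C", 1:40)
)
corr_demo$hdi <- 0.2 + 0.15 * log(corr_demo$cpi) + rnorm(40, sd = 0.03)
corr_labeled <- corr_demo[1:5, ]  # arbitrary subset of points to label

p_corr <- ggplot(corr_demo, aes(x = cpi, y = hdi)) +
  # semi-transparent points colored by group
  geom_point(aes(color = region), alpha = 0.6) +
  # linear model fitted on log(x), drawn in grey40
  geom_smooth(method = "lm", formula = y ~ log(x), color = "grey40") +
  # repelled labels; the seed makes the label placement reproducible
  geom_text_repel(
    data = corr_labeled, aes(label = country),
    box.padding = 0.6, min.segment.length = 0, seed = 9876
  ) +
  scale_color_brewer(palette = "Set1") +
  # enlarge the points in the legend only
  guides(color = guide_legend(override.aes = list(size = 3)))
```

Setting min.segment.length = 0 asks ggrepel to draw a connecting segment for every label, however close it sits to its point.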
Exercise 3
Use toy_data to recreate the following graphic as precisely as possible.
Hints:
- Point sizes are 3
- Point colors: #E66100, #5D3A9B
- Point transparency is 0.8
- stat_function() will be useful
  - line size is 1.3 and line color is #56B4E9
- quote() will be useful
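One possible way to combine these hints is sketched below. It rebuilds toy_data with the same recipe as the setup code, using base R and a different seed, so the draws will not match the lab's. Which color goes with which level of larger_than, and the use of quote() for the axis labels, are assumptions, since the target graphic is not reproduced here.

```r
library(ggplot2)

# Same recipe as the setup code, rebuilt with base R so the block stands alone
set.seed(1)
theta <- seq(0, 2 * pi, length.out = 100)
obs <- rnorm(100, sin(theta), 0.1)
toy_data <- data.frame(
  theta = theta,
  obs = obs,
  larger_than = ifelse(abs(obs) > abs(sin(theta)), "1", "0")
)

p_toy <- ggplot(toy_data, aes(x = theta, y = obs)) +
  geom_point(aes(color = larger_than), size = 3, alpha = 0.8) +
  # stat_function() evaluates sin() over the x-range of the plot
  stat_function(fun = sin, linewidth = 1.3, color = "#56B4E9") +
  # assumed assignment of the two hinted colors to the two levels
  scale_color_manual(values = c("0" = "#E66100", "1" = "#5D3A9B")) +
  # quote() keeps the expression unevaluated so it renders as plotmath
  labs(x = quote(theta), y = quote(sin(theta)))
```

Because the labels are passed as unevaluated calls, theta is rendered as the Greek letter instead of the literal word.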
Exercise 4
Using cdc_small, construct a scatterplot of weight by height with the following requirements:
- Size of plotting characters should be 3.
- Color and shape should both identify genhlth.
- Only one legend: for both color and shape.
- Legend title should be “General Health?” with a newline starting after general.
- Legend categories should be ordered from excellent (top) to poor (bottom) with each word in category capitalized in the legend.
- Legend should be placed in the lower right-hand corner of the plotting area.
- Color should follow the "Set1" palette.
- Shape should have a solid triangle (17) for excellent, solid circle (19) for very good, an x (4) for poor, a hollow rotated square with an x in it (9) for fair, and a solid square (15) for good.
- height values should be limited between 55 and 80.
- height axis should display every 5th number between 55 and 80 and be appropriately labeled (i.e. 55 in, 60 in, …, 80 in). No axis title is necessary.
- weight values should be limited between 100 and 300.
- weight axis should be transformed (trans) to a log base 10 scale, but still display weights in pounds starting at 100 and displaying every 25 pounds until 300. Must be appropriately labeled (i.e. 100 lbs, 125 lbs, …, 300 lbs). No axis title is necessary.
- Graph title should be CDC BRFSS: Weight by Height.
- Minimal theme.
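The requirements above can be sketched as follows, using a simulated stand-in for cdc_small that has only the three columns involved. This is one possible arrangement, not the reference solution. It assumes ggplot2 3.5.0 or later, where the transformation argument is named transform (older releases call it trans) and the legend is placed inside the panel via legend.position.inside.

```r
library(ggplot2)

# Stand-in for cdc_small: only height, weight, and genhlth are used
set.seed(3)
health_levels <- c("excellent", "very good", "good", "fair", "poor")
cdc_demo <- data.frame(
  height = runif(50, 58, 78),
  weight = runif(50, 110, 280),
  genhlth = factor(rep(health_levels, each = 10), levels = health_levels)
)

# Identical name and labels on both scales let ggplot2 merge them into one legend
legend_title <- "General\nHealth?"
legend_labels <- c("Excellent", "Very Good", "Good", "Fair", "Poor")

p_cdc <- ggplot(cdc_demo, aes(x = height, y = weight, color = genhlth, shape = genhlth)) +
  geom_point(size = 3) +
  scale_color_brewer(name = legend_title, palette = "Set1", labels = legend_labels) +
  scale_shape_manual(
    name = legend_title, labels = legend_labels,
    values = c(17, 19, 15, 9, 4)  # excellent, very good, good, fair, poor
  ) +
  scale_x_continuous(
    name = NULL, limits = c(55, 80), breaks = seq(55, 80, 5),
    labels = scales::label_number(suffix = " in")
  ) +
  scale_y_continuous(
    name = NULL, transform = "log10", limits = c(100, 300),
    breaks = seq(100, 300, 25),
    labels = scales::label_number(suffix = " lbs")
  ) +
  ggtitle("CDC BRFSS: Weight by Height") +
  theme_minimal() +
  # anchor the legend's lower-right corner to the panel's lower-right corner
  theme(
    legend.position = "inside",
    legend.position.inside = c(1, 0),
    legend.justification = c(1, 0)
  )
```

The single legend depends on both scales sharing exactly the same title and labels. If either differs, ggplot2 draws two separate legends.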